35 research outputs found

    GEML: A Grammatical Evolution, Machine Learning Approach to Multi-class Classification

    In this paper, we propose a hybrid approach to solving multi-class problems which combines evolutionary computation with elements of traditional machine learning. The method, Grammatical Evolution Machine Learning (GEML), adapts machine learning concepts from decision tree learning and clustering methods and integrates these into a Grammatical Evolution framework. We investigate the effectiveness of GEML on several supervised, semi-supervised and unsupervised multi-class problems and demonstrate its competitive performance when compared with several well-known machine learning algorithms. The GEML framework evolves human-readable solutions which provide an explanation of the logic behind its classification decisions, offering a significant advantage over existing paradigms for unsupervised and semi-supervised learning. In addition, we examine the possibility of improving the performance of the algorithm through the application of several ensemble techniques.

    Evolving clusters in gene-expression data

    Clustering is a useful exploratory tool for gene-expression data. Although successful applications of clustering techniques have been reported in the literature, there is no method of choice in the gene-expression analysis community. Moreover, there are only a few works that deal with the problem of automatically estimating the number of clusters in bioinformatics datasets. Most clustering methods require the number k of clusters to be either specified in advance or selected a posteriori from a set of clustering solutions over a range of k. In both cases, the user has to select the number of clusters. This paper proposes improvements to a clustering genetic algorithm that is capable of automatically discovering an optimal number of clusters and its corresponding optimal partition based upon numeric criteria. The proposed improvements are mainly designed to enhance the efficiency of the original clustering genetic algorithm, resulting in two new clustering genetic algorithms and an evolutionary algorithm for clustering (EAC). The original clustering genetic algorithm and its modified versions are evaluated in several runs using six gene-expression datasets in which the right clusters are known a priori. The results illustrate that all the proposed algorithms perform well on gene-expression data, although statistical comparisons in terms of the computational efficiency of each algorithm indicate that EAC outperforms the others. Statistical evidence also shows that EAC is able to outperform a traditional method based on multiple runs of k-means over a range of k. © 2005 Elsevier Inc. All rights reserved.

    The construction of causal networks to estimate coral bleaching intensity

    Current metrics for predicting bleaching episodes, e.g. NOAA's Coral Reef Watch Program, do not seem to apply well to Brazil's marginal reefs located in Bahia state, and alternative predictive approaches must be sought for effective long-term management. Bleaching occurrences at Abrolhos have been observed since the 1990s but with a much lower frequency/extent than for other reef systems worldwide. We constructed a Bayesian Belief Network (BN) to back-predict the intensity of bleaching events and learn how local and regional scale forcing factors interact to enhance or alleviate coral bleaching specific to Abrolhos. Bleaching intensity data were collected for several reef sites across Bahia state coast (~12°-20°S; 37°-40°W) during the austral summer 1994-2005 and compared to environmental data: sea surface temperature (SST), diffuse light attenuation coefficient at 490 nm (K490), rain precipitation, wind velocities, and El Niño Southern Oscillation (ENSO) proxies. Conditional independence tests were calculated to produce four specialized BNs, each with specific factors that likely regulate bleaching intensity. All specialized BNs identified that a five-day accumulated SST proxy (SSTAc5d) was the exclusive parent node for coral bleaching, producing a total predictive rate of 88% based on SSTAc5d state. When SSTAc5d was simulated as unknown, the Thermal-Eolic Resultant BN kept the total predictive rate of 88%. Our approach has produced initial means to predict bleaching intensity at Abrolhos. However, the robustness of the model required for management purposes must be further (and regularly) operationally tested with new in situ and remote sensing data. © 2013 Elsevier Ltd

    Introducing interactive evolutionary computation in data clustering

    Data clustering consists in finding homogeneous groups in a dataset. The importance attributed to cluster analysis is related to its fundamental role in many knowledge fields. Often data clustering techniques are the ghost host of many innovative applications for a wide range of problems (e.g. biology, marketing, customer segmentation, intelligent machines, machine translation, etc.). Recently, there has been an emerging interest in the data clustering community in developing bio-inspired algorithms in order to find new methods for clustering. It is widely observed that bio-inspired algorithms and Evolutionary Computation (EC) techniques reach solutions similar to those of other computational approaches but use more computational power. This limitation represents a concrete obstacle to an extensive use of evolutionary (or bio-inspired) approaches in data clustering applications. In the present paper we propose to use Interactive Evolutionary Computation (IEC) techniques where a human being (the breeder) selects cluster configurations (genotypes) on the basis of their graphical visualizations (phenotypes). We describe a first version of a software tool, called Revok, that implements the basic IEC principles applied to data clustering. In the conclusion section we outline the necessary steps to reach a mature IEC tool for data clustering.

    Adaptive crossover memetic differential harmony search for optimizing document clustering

    An Adaptive Crossover Memetic Differential Harmony Search (ACMDHS) method was developed for optimizing document clustering in this paper. Due to the complexity of the documents available today, allocating the centroids of the document clusters and finding the optimum clusters in the search space have become more difficult. One possible enhancement to document clustering is the use of the Harmony Search (HS) algorithm to optimize the search. As HS is highly dependent on its control parameters, a differential version of HS (DHS) was introduced. In this modified version of HS, pitch adjustment based on the Band Width (BW) parameter has been replaced by another pitch adjustment technique because of the sensitivity of the BW parameter; the Differential Evolution (DE) mutation was used instead. In this paper the DE crossover was also used with the Differential HS for further search space exploitation; the resulting global search is named Crossover DHS (CDHS). Moreover, the DE crossover (Cr) and mutation (F) probabilities are dynamically tuned through generations. Memetic optimization was used to enhance the local search capability of CDHS. The proposed ACMDHS was compared to other document clustering techniques using the HS, DHS, and K-means methods. It was also compared to its two other variants, the Memetic DHS (MDHS) and the Crossover Memetic Differential Harmony Search (CMDHS). Moreover, two state-of-the-art clustering methods were also considered in the comparisons: the Chaotic Gradient Artificial Bee Colony (CGABC) and the Differential Evolution Memetic Clustering (DEMC). The experimental results showed that the CMDHS variant (the non-adaptive version of ACMDHS) and ACMDHS were highly competitive with each other, while both were superior to all other methods.
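    The two DE operators named in the abstract above can be illustrated with a minimal sketch. It assumes the textbook DE/rand/1 mutation and binomial crossover; the abstract does not say which DE variant the authors used, so this is not their implementation. The function names `de_mutation` and `de_crossover`, and the toy population, are hypothetical.

```python
import random

def de_mutation(population, target_idx, F, rng):
    """Textbook DE/rand/1 mutation (assumed variant): v = a + F * (b - c).

    a, b, c are three distinct population members, none of them the target.
    """
    candidates = [i for i in range(len(population)) if i != target_idx]
    r1, r2, r3 = rng.sample(candidates, 3)
    a, b, c = population[r1], population[r2], population[r3]
    return [a_j + F * (b_j - c_j) for a_j, b_j, c_j in zip(a, b, c)]

def de_crossover(target, mutant, Cr, rng):
    """Textbook binomial crossover (assumed variant).

    Each component comes from the mutant with probability Cr; one randomly
    chosen index always comes from the mutant so the trial vector is never
    an untouched copy of the target.
    """
    j_rand = rng.randrange(len(target))
    return [
        m if (rng.random() < Cr or j == j_rand) else t
        for j, (t, m) in enumerate(zip(target, mutant))
    ]

# Hypothetical toy population: four candidate vectors of dimension 3.
rng = random.Random(0)
population = [[0.0, 0.0, 0.0], [1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 6.0, 9.0]]
mutant = de_mutation(population, target_idx=0, F=0.5, rng=rng)
trial = de_crossover(population[0], mutant, Cr=0.9, rng=rng)
```

    In a document clustering setting each vector would typically encode cluster centroids, and an adaptive scheme would update `F` and `Cr` between generations; both points are left out of this sketch.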